21 research outputs found

    Inverse correspondence analysis

    In correspondence analysis (CA), rows and columns of a data matrix are depicted as points in low-dimensional space. The row and column profiles are approximated by minimizing the so-called weighted chi-squared distance between the original profiles and their approximations; see, for example, [Theory and Applications of Correspondence Analysis, Academic Press, New York, 1984]. In this paper, we study the inverse CA problem, that is, the possibilities for retrieving one or more data matrices from a low-dimensional CA solution. We show that there exists a nonempty, closed, and bounded polyhedron of such matrices. We also present two algorithms to find the vertices of the polyhedron: an exact algorithm that finds all vertices and a heuristic approach for larger problems that finds some of the vertices. A proof that the maximum of the Pearson chi-squared statistic is attained at one of the vertices is given. In addition, we discuss how extra equality constraints on some elements of the data matrix can be imposed on the inverse CA problem. As a special case, we also present a method for imposing integer restrictions on the data matrix. The approach to inverse CA followed here is similar to the one employed by De Leeuw and Groenen [J. Classification 14 (1997) 3] in their inverse multidimensional scaling problem.
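
    For readers unfamiliar with the forward computation that the inverse problem starts from, the following is a minimal sketch of ordinary CA as an SVD of the standardized residuals of a contingency table; the function name and the toy table are illustrative only and are not taken from the paper.

        # Minimal sketch of (forward) correspondence analysis, the solution from which
        # an inverse CA problem would start. Assumes a nonnegative two-way table N.
        import numpy as np

        def correspondence_analysis(N, n_dims=2):
            """Return row/column principal coordinates and singular values of table N."""
            P = N / N.sum()                      # correspondence matrix
            r = P.sum(axis=1)                    # row masses
            c = P.sum(axis=0)                    # column masses
            # Standardized residuals of the independence model
            S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
            U, sv, Vt = np.linalg.svd(S, full_matrices=False)
            # Principal coordinates: singular vectors rescaled by masses and singular values
            F = (U[:, :n_dims] * sv[:n_dims]) / np.sqrt(r)[:, None]     # rows
            G = (Vt.T[:, :n_dims] * sv[:n_dims]) / np.sqrt(c)[:, None]  # columns
            return F, G, sv

        # Toy abundance table for illustration only
        N = np.array([[10., 4., 1.], [6., 8., 3.], [1., 5., 9.]])
        F, G, sv = correspondence_analysis(N)
        print(F.round(3), G.round(3), sv.round(3))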

    Memento for interprofessional learning


    Simplification and Shift in Cognition of Political Difference: Applying the Geometric Modeling to the Analysis of Semantic Similarity Judgment

    Perceiving differences by means of spatial analogies is intrinsic to human cognition. Multi-dimensional scaling (MDS) analysis based on Minkowski geometry has been used primarily on data on sensory similarity judgments, leaving judgments of abstract differences unanalyzed; indeed, analysts have so far lacked appropriate experimental or real-life data in this regard. Our MDS analysis used survey data on political scientists' judgments of the similarities and differences between political positions, expressed in terms of distance. Both distance smoothing and majorization techniques were applied to a three-way dataset of similarity judgments provided by at least seven experts on at least five parties' positions on at least seven policies (i.e., originally yielding 245 dimensions) to substantially reduce the risk of local minima. The analysis found two dimensions to be sufficient for mapping differences, and the city-block metric fit better than the Euclidean metric in all datasets obtained from 13 countries. Most city-block dimensions were highly correlated with the simplified criterion for differences actually used in real politics (i.e., the left–right ideology). The isometry of the city-block and dominance metrics in two-dimensional space carries further implications: individuals may attend to two dimensions (if represented in the city-block metric) or focus on a single dimension (if represented in the dominance metric) when judging differences between the same objects. Switching between metrics may occur during cognitive processing as frequently as the apparent discontinuities and shifts in human attention that underlie changing judgments in real situations. Consequently, the results lend strong support to the validity of geometric models for representing an important form of social cognition, namely the cognition of political differences, which is deeply rooted in human nature.
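
    As a rough illustration of how competing Minkowski metrics can be compared for the same configuration, the sketch below evaluates raw stress under the city-block (p = 1), Euclidean (p = 2), and dominance (p = infinity) metrics; the data and function names are invented, and the distance-smoothing and majorization procedure of the study is not implemented here.

        # Hedged sketch: raw stress of a fixed 2-D configuration under different
        # Minkowski metrics, as a way to compare how well each metric reproduces
        # a given dissimilarity matrix. Toy data only.
        import numpy as np
        from scipy.spatial.distance import pdist

        def raw_stress(X, delta, p):
            """Sum of squared differences between dissimilarities and Minkowski-p distances."""
            if np.isinf(p):
                d = pdist(X, metric="chebyshev")       # dominance metric (p = infinity)
            else:
                d = pdist(X, metric="minkowski", p=p)  # p = 1 city-block, p = 2 Euclidean
            return float(np.sum((delta - d) ** 2))

        rng = np.random.default_rng(0)
        X = rng.normal(size=(5, 2))                    # five "party positions" in two dimensions
        delta = pdist(X, metric="cityblock")           # toy dissimilarities, city-block by construction
        for p in (1, 2, np.inf):
            print(f"p = {p}: raw stress = {raw_stress(X, delta, p):.3f}")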

    A flexible framework for sparse simultaneous component based data integration

    Background: High-throughput data are complex, and methods that reveal the structure underlying the data are most useful. Principal component analysis, frequently implemented as a singular value decomposition, is a popular technique in this respect. Nowadays the challenge is often to reveal structure in several sources of information (e.g., transcriptomics, proteomics) that are available for the same biological entities under study. Simultaneous component methods are most promising in this respect. However, the interpretation of the principal and simultaneous components is often daunting because the contributions of each of the biomolecules (transcripts, proteins) have to be taken into account.

    Results: We propose a sparse simultaneous component method that makes many of the parameters redundant by shrinking them to zero. It includes principal component analysis, sparse principal component analysis, and ordinary simultaneous component analysis as special cases. Several penalties can be tuned that account in different ways for the block structure present in the integrated data. This yields known sparse approaches such as the lasso, the ridge penalty, the elastic net, the group lasso, the sparse group lasso, and the elitist lasso. In addition, the algorithmic results can easily be transposed to the context of regression. Metabolomics data obtained with two measurement platforms for the same set of Escherichia coli samples are used to illustrate the proposed methodology and the properties of the different penalties with respect to sparseness across and within data blocks.

    Conclusion: Sparse simultaneous component analysis is a useful method for data integration: first, simultaneous analyses of multiple blocks offer advantages over sequential and separate analyses, and second, interpretation of the results is greatly facilitated by their sparseness. The approach is flexible and allows the block structure to be taken into account in different ways. As such, structures can be found that are exclusively tied to one data platform (group lasso approach) as well as structures that involve all data platforms (elitist lasso approach).

    Availability: The additional file contains a MATLAB implementation of the sparse simultaneous component method.
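
    The sketch below illustrates the basic idea of a simultaneous component analysis of two data blocks with lasso-style soft-thresholding of the loadings; it is a toy illustration in Python, not the MATLAB implementation supplied with the paper, and the block sizes and threshold value are arbitrary assumptions.

        # Hedged sketch of simultaneous component analysis of two blocks measured on
        # the same samples, with optional soft-thresholding to obtain sparse loadings.
        import numpy as np

        def sca(blocks, n_comp=2, threshold=0.0):
            """Concatenate column-centred blocks, take a truncated SVD, soft-threshold loadings."""
            X = np.hstack([b - b.mean(axis=0) for b in blocks])
            U, s, Vt = np.linalg.svd(X, full_matrices=False)
            scores = U[:, :n_comp] * s[:n_comp]          # common component scores
            loadings = Vt[:n_comp].T
            # Lasso-style soft-thresholding shrinks small loadings exactly to zero
            loadings = np.sign(loadings) * np.maximum(np.abs(loadings) - threshold, 0.0)
            return scores, loadings

        rng = np.random.default_rng(1)
        block1 = rng.normal(size=(20, 6))   # e.g., intensities from measurement platform 1
        block2 = rng.normal(size=(20, 4))   # e.g., intensities from measurement platform 2
        scores, loadings = sca([block1, block2], n_comp=2, threshold=0.1)
        print(scores.shape, int((loadings == 0).sum()), "loadings shrunk to zero")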

    Seriation by constrained correspondence analysis: a simulation study

    One of the many areas in which correspondence analysis (CA) is an effective method is seriation. For example, CA is a well-known technique for the seriation of archaeological assemblages. A problem with the CA seriation solution, however, is that only a relative ordering of the assemblages is obtained. To improve the usual CA solution, a constrained CA approach may be considered that incorporates additional information in the form of equality and inequality constraints concerning the time points of the assemblages. Using such constraints, explicit dates can be assigned to the seriation solution. We extend the set of constraints that can be used in CA by introducing interval constraints, that is, constraints that place the CA solution within a specific time frame. Moreover, the quality of the constrained CA solution is studied in a simulation study. In particular, the simulation study allows us to assess how well ordinary and constrained CA can recover the true time order. Furthermore, for the constrained approach, it is shown that the true dates are retrieved satisfactorily. The simulation study is set up so that it mimics the data of a series of ceramic assemblages consisting of the locally produced tableware from Sagalassos (SW Turkey). It is found that dating the assemblages on the basis of constraints appears to work quite well. © 2008 Elsevier B.V. All rights reserved.
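
    A minimal sketch of the unconstrained seriation step, ordering assemblages by their score on the first CA axis, is given below; the abundance table is invented for illustration, and the constrained (dated) variant discussed above is not implemented.

        # Hedged sketch of CA-based seriation: assemblages (rows) are ordered by
        # their standard coordinate on the first CA axis. Toy data only.
        import numpy as np

        def ca_first_axis_row_scores(N):
            """First-axis standard row coordinates of an abundance table N."""
            P = N / N.sum()
            r, c = P.sum(axis=1), P.sum(axis=0)
            S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
            U, s, Vt = np.linalg.svd(S, full_matrices=False)
            return U[:, 0] / np.sqrt(r)

        # Toy table: 4 assemblages (rows) x 5 ceramic types (columns)
        N = np.array([[12, 5, 1, 0, 0],
                      [ 4, 9, 6, 1, 0],
                      [ 0, 3, 8, 7, 2],
                      [ 0, 0, 2, 6, 10]], dtype=float)
        order = np.argsort(ca_first_axis_row_scores(N))
        print("Relative seriation order of assemblages:", order)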

    A Censored Mixture Model for Modeling Risk Taking

    Risk behavior has substantial consequences for health, well-being, and general behavior. The association between real-world risk behavior and risk behavior on experimental tasks is well documented, but modeling them is challenging for several reasons. First, many experimental risk tasks may end prematurely, leading to censored observations. Second, certain outcome values can be more attractive than others. Third, a priori unknown groups of participants can react differently to certain risk levels. Here, we propose the censored mixture model, which models risk taking while accounting for censoring, the attractiveness of certain outcomes, and unobserved individual risk preferences, in addition to experimental conditions.
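
    The sketch below illustrates the likelihood logic such a model might rest on: fully observed outcomes contribute a mixture density, while right-censored outcomes contribute a mixture survival probability. The normal components and parameter values are assumptions for illustration only, not the model specification from the paper.

        # Hedged sketch: log-likelihood of a K-component normal mixture with right-censoring,
        # e.g., trials that ended prematurely at a known cut-off. Toy values only.
        import numpy as np
        from scipy.stats import norm

        def censored_mixture_loglik(y, censored, weights, means, sds):
            """Log-likelihood with density terms for observed and survival terms for censored cases."""
            ll = 0.0
            for yi, ci in zip(y, censored):
                if ci:   # censored: only known that the outcome reached at least yi
                    contrib = sum(w * norm.sf(yi, m, s) for w, m, s in zip(weights, means, sds))
                else:    # fully observed outcome
                    contrib = sum(w * norm.pdf(yi, m, s) for w, m, s in zip(weights, means, sds))
                ll += np.log(contrib)
            return ll

        y        = np.array([3.2, 5.0, 5.0, 2.1, 4.4])      # 5.0 = task cut-off
        censored = np.array([False, True, True, False, False])
        print(censored_mixture_loglik(y, censored,
                                      weights=[0.6, 0.4], means=[3.0, 5.5], sds=[1.0, 1.0]))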

    Applying Unfolding in the Construction of Non-linear Biplot Trajectories. (Papers: International conference)


    Nonlinear Biplots with a Shortest Distance Interpretation.
